Abstract

New inequalities are proved for the variance of the Pitman estimators (minimum variance equivariant estimators) of $\theta$ constructed from samples of fixed size from populations $F(x-\theta)$. The inequalities are closely related to the classical Stam inequality for the Fisher information, its analog in small samples, and a powerful variance drop inequality. The only condition required is finite variance of $F$; even the absolute continuity of $F$ is not assumed. As corollaries of the main inequalities for small samples, one obtains alternate proofs of known properties of the Fisher information, as well as interesting new observations like the fact that the variance of the Pitman estimator based on a sample of size $n$ scaled by $n$ monotonically decreases in $n$. Extensions of the results to the polynomial versions of the Pitman estimators and a multivariate location parameter are given. Also, the search for characterization of equality conditions for one of the inequalities leads to a Cauchy-type functional equation for independent random variables, and an interesting new behavior of its solutions is described.

Keywords: Fisher information, location parameter, monotonicity of the vari- 
ance, Stam inequality. 

* Corresponding author 






1 Introduction 



Our goal is to present some new inequalities for the variance of the Pitman 
estimators of a location parameter from different related samples. 

Denote by $t_n$ the Pitman estimator (i.e., the minimum variance equivariant estimator) of $\theta$ from a sample $(x_1, \ldots, x_n)$ of size $n$ from population $F(x-\theta)$. For simplicity, we first focus on the univariate case, i.e., $x_i \in \mathbb{R}$. If $\int x^2\,dF(x) < \infty$, the Pitman estimator can be written as

\[ t_n = \bar{x} - E(\bar{x} \mid x_1 - \bar{x}, \ldots, x_n - \bar{x}) \tag{1} \]

where $\bar{x}$ is the sample mean and $E$ denotes the expectation with respect to $F(x)$ (i.e., when $\theta = 0$).

For the univariate case, if $F' = f$ exists, $t_n$ can also be written as

\[ t_n = \frac{\int u \prod_{i=1}^{n} f(x_i - u)\,du}{\int \prod_{i=1}^{n} f(x_i - u)\,du} \tag{2} \]

showing that $t_n$ is a generalized Bayes estimator corresponding to an improper prior (uniform on the whole $\mathbb{R}$). In this paper the representation (2), crucial in studying the behavior of $t_n$ in large samples, will not be used.

In Section 2 we obtain a relationship between the variances of the Pitman estimators based on data obtained by adding (convolving) the initial samples. As an application of this inequality, one obtains a new proof of a Fisher information inequality related to the central limit theorem. Another application, to distributed estimation using sensor networks, is described elsewhere [15].

If $t_n^{(1)}, \ldots, t_n^{(N)}$ denote the Pitman estimators from samples of size $n$ from $F_1(x-\theta), \ldots, F_N(x-\theta)$, and $t_n$ is the Pitman estimator from a sample of size $n$ from $F(x-\theta)$ where $F = F_1 * \cdots * F_N$, Kagan [10] showed the superadditivity property

\[ \mathrm{var}(t_n) \ge \mathrm{var}(t_n^{(1)}) + \cdots + \mathrm{var}(t_n^{(N)}). \tag{3} \]

In Section 3 we obtain this as a corollary of the main inequality in Section 2 and study an analytic problem arising in connection with identifying its equality conditions. In particular, a version of the classical Cauchy functional equation for independent random variables is studied; the behavior of this equation turns out to be more subtle than in the usual settings.

In Section 4 various inequalities relevant to estimation from a combination of samples are given. For instance, for the Pitman estimator $t_{m+n}$ constructed from observations $x_1, \ldots, x_m, y_1, \ldots, y_n$ where the first $m$ observations come from $F(x-\theta)$ and the last $n$ from $G(x-\theta)$,

\[ \frac{1}{\mathrm{var}(t_{m+n})} \ge \frac{1}{\mathrm{var}(t_m)} + \frac{1}{\mathrm{var}(t_n)}, \tag{4} \]

where $t_m$ and $t_n$ denote the Pitman estimators constructed from $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ respectively. A generalization of this inequality has an interesting application to a data pricing problem (where datasets are to be sold, and the value of a dataset comes from the information it yields about an unknown location parameter); this application is described by the authors elsewhere [15].

As an application of the inequalities proved in Section 4 we prove in Section 5 that for any $n \ge 1$, with $t_n$ now denoting the Pitman estimator constructed from $x_1, \ldots, x_n$ for any $n$,

\[ n\,\mathrm{var}(t_n) \ge (n+1)\,\mathrm{var}(t_{n+1}) \tag{5} \]

with the equality sign holding for an $n \ge 2$ only for a sample from a Gaussian population (in which case $n\,\mathrm{var}(t_n)$ is constant in $n$).

If $(x_1, \ldots, x_n)$ is a sample from an $s$-variate population $F(x-\theta)$, $x, \theta \in \mathbb{R}^s$, with $\int_{\mathbb{R}^s} |x|^2\,dF(x) < \infty$, the Pitman estimator is defined as the minimum covariance matrix equivariant estimator. Though there is only partial ordering in the set of covariance matrices, the set of covariance matrices of equivariant estimators has a minimal element which is the covariance matrix of the Pitman estimator (1) of the $s$-variate location parameter. Multivariate extensions of most of the inequalities mentioned above are given in Section 6.

Assuming $\int x^{2k}\,dF(x) < \infty$ for some integer $k \ge 1$, the polynomial Pitman estimator $t_n^{(k)}$ of degree $k$ is, by definition, the minimum variance equivariant polynomial estimator (see Kagan [11]). An advantage of the polynomial Pitman estimator is that it depends only on the first $2k$ moments of $F$. In Section 7 it is shown that the polynomial Pitman estimator preserves almost all the properties of $t_n$ that are studied here.

In Section 8 the setup of observations $x_1, \ldots, x_n$ additively perturbed by independent $y_1, \ldots, y_n$ with self-decomposable distribution function $G(y/\lambda)$ is considered. For the Pitman estimator $t_{n,\lambda}$ from a sample of size $n$ from $F_\lambda(x-\theta)$ where $F_\lambda(x) = \int F(x-u)\,dG(u/\lambda)$ we prove that $\mathrm{var}(t_{n,\lambda})$, as a function of $\lambda$, monotonically decreases on $(-\infty, 0)$ and increases on $(0, +\infty)$. This makes rigorous the intuition that adding "noise" makes estimation harder.

Section 9 concludes with some discussion of the issues that arise in considering various possible generalizations of the results presented in this paper.

1.1 Related literature 

All our results have direct counterparts in terms of the Fisher information, 
and demonstrate very close similarities between properties of the inverse 
Fisher information and the variance of Pitman estimators. 

Denote by $I(X)$ the Fisher information on a parameter $\theta \in \mathbb{R}$ contained in an observation $X + \theta$. Plainly, the information depends only on the distribution $F$ of the noise $X$ but not on $\theta$.

For independent $X$, $Y$ the inequality $I(X+Y) \le I(X)$ is almost trivial (an observation $X + Y + \theta$ is "more noisy" than $X + \theta$). A much less trivial inequality was proved in Stam [20]:

\[ \frac{1}{I(X+Y)} \ge \frac{1}{I(X)} + \frac{1}{I(Y)}. \tag{6} \]

In Zamir [21], the Stam inequality is obtained as a direct corollary of the basic properties of the Fisher information: additivity, monotonicity and reparameterization formula.

The main inequality in Section 2 is closely related to the classical Stam inequality for the Fisher information, its version in estimation and a powerful variance drop inequality proved in a general form in Madiman and Barron [17] (described below). In Artstein et al. [1] and Madiman and Barron [17] the variance drop inequality led to improvements of the Stam inequality.

Let now $F(x) = (F_1 * F_2)(x) = \int F_1(y)\,dF_2(x-y)$ and $t'_n$, $t''_n$, $t_n$ be the Pitman estimators from samples of size $n$ from $F_1(x-\theta)$, $F_2(x-\theta)$ and $F(x-\theta)$, respectively. If $\int x^2\,dF(x) < \infty$, the following inequality holds for the variances (Kagan [10]):

\[ \mathrm{var}(t_n) \ge \mathrm{var}(t'_n) + \mathrm{var}(t''_n). \tag{7} \]

This inequality is, in a sense, a finite sample version of (6), as discussed in Kagan [10]. It is generalized in Section 2 and its equality conditions are obtained in Section 3.

Several of the results in this paper rely on the following variance drop lemma.

Lemma 1. Let $X_1, \ldots, X_N$ be independent (not necessarily identically distributed) random vectors. For $s = \{i_1, \ldots, i_m\} \subset \{1, \ldots, N\}$ set $X_s = (X_{i_1}, \ldots, X_{i_m})$, with $i_1 < i_2 < \cdots < i_m$ without loss of generality. For arbitrary functions $\phi_s(X_s)$ with $\mathrm{var}\{\phi_s(X_s)\} < \infty$ and any weights $w_s \ge 0$, $\sum_s w_s = 1$,

\[ \mathrm{var}\Big\{\sum_s w_s \phi_s(X_s)\Big\} \le \binom{N-1}{m-1} \sum_s w_s^2\, \mathrm{var}\{\phi_s(X_s)\} \tag{8} \]

where the summation in both sides is extended over all unordered sets (combinations) $s$ of $m$ elements from $\{1, \ldots, N\}$.

The equality sign in (8) holds if and only if all $\phi_s(X_s)$ are additively decomposable, i.e.,

\[ \phi_s(X_s) = \sum_{i \in s} \phi_{si}(X_i). \tag{9} \]

The main idea of the proof goes back to Hoeffding [5] and is based on an ANOVA type decomposition, see also Efron and Stein [3]. See Artstein et al. [1] for the proof of Lemma 1 in case of $m = N - 1$, and Madiman and Barron [17] for the general case. In Section 6 we observe that this lemma has a multivariate extension, and use it to prove various inequalities for Pitman estimation of a multivariate location parameter.
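
Inequality (8) can be checked exactly on a small discrete example. The sketch below is an illustration only and is not part of the original argument: it assumes i.i.d. variables uniform on {0, 1}, takes N = 3 and m = 2 with equal weights, and the helper names (`variance`, `variance_drop_sides`) are hypothetical. Exact rational arithmetic is used so that both sides can be compared without rounding.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def variance(values, probs):
    """Exact variance of a finite discrete random variable."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

def variance_drop_sides(N, m, phi, weights, support=(0, 1)):
    """Return (lhs, rhs) of inequality (8) for i.i.d. uniform X_i on `support`.

    phi[s] is a function of the tuple X_s and weights[s] is the weight w_s.
    """
    outcomes = list(product(support, repeat=N))
    probs = [Fraction(1, len(outcomes))] * len(outcomes)
    subsets = list(combinations(range(N), m))

    def restrict(x, s):
        return tuple(x[i] for i in s)

    total = [sum(weights[s] * phi[s](restrict(x, s)) for s in subsets)
             for x in outcomes]
    lhs = variance(total, probs)
    rhs = comb(N - 1, m - 1) * sum(
        weights[s] ** 2
        * variance([phi[s](restrict(x, s)) for x in outcomes], probs)
        for s in subsets
    )
    return lhs, rhs

# Assumed toy setting: N = 3, m = 2, equal weights summing to one.
subsets = list(combinations(range(3), 2))
w = {s: Fraction(1, 3) for s in subsets}

# Products are not additively decomposable: the inequality is strict here.
phi_prod = {s: (lambda xs: Fraction(xs[0] * xs[1])) for s in subsets}
lhs_prod, rhs_prod = variance_drop_sides(3, 2, phi_prod, w)

# Symmetric sums are additively decomposable: both sides coincide here.
phi_sum = {s: (lambda xs: Fraction(xs[0] + xs[1])) for s in subsets}
lhs_sum, rhs_sum = variance_drop_sides(3, 2, phi_sum, w)
```

In this toy setting the product functions give a strict inequality, while the symmetric additive functions attain equality, in line with the role of condition (9).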

The main inequality of Section 5 is also related to Carlen's [3] superadditivity of Fisher information, as touched upon there. See [8] for the statistical meaning and proof of Carlen's superadditivity.

2 Convolving independent samples from different 
populations 

Here we first prove a stronger version of superadditivity (7).

Let $x_k = (x_{k1}, \ldots, x_{kn})$, $k = 1, \ldots, N$ be a sample of size $n$ from population $F_k(x-\theta)$. Set

\[ \bar{x}_k = \frac{x_{k1} + \cdots + x_{kn}}{n}, \quad R_k = (x_{k1} - \bar{x}_k, \ldots, x_{kn} - \bar{x}_k), \quad \sigma_k^2 = \mathrm{var}(x_{k1}), \]

and for $s = \{i_1, \ldots, i_m\} \subset \{1, \ldots, N\}$,

\[ F_s(x) = (F_{i_1} * \cdots * F_{i_m})(x), \quad \bar{x}_s = \sum_{k \in s} \bar{x}_k, \quad R_s = \sum_{k \in s} R_k \ \text{(componentwise)}. \]

Also set

\[ F(x) = (F_1 * \cdots * F_N)(x), \quad \bar{x} = \bar{x}_1 + \cdots + \bar{x}_N, \quad R = R_1 + \cdots + R_N, \quad \sigma^2 = \sigma_1^2 + \cdots + \sigma_N^2. \]

We will need the following well known lemma (see, e.g., 

Lemma 2. Let $\xi$ be a random variable with $E|\xi| < \infty$ and $\eta_1$, $\eta_2$ arbitrary random elements. If $(\xi, \eta_1)$ and $\eta_2$ are independent then

\[ E(\xi \mid \eta_1, \eta_2) = E(\xi \mid \eta_1) \quad \text{a.s.} \tag{10} \]

Theorem 1. Let $t_{s,n}$ denote the Pitman estimator of $\theta$ from a sample of size $n$ from $F_s(x-\theta)$, and $t_n$ denote the Pitman estimator from a sample of size $n$ from $F(x-\theta)$. Under the only condition $\sigma^2 < \infty$, for any $n \ge 1$ and any $m$ with $1 \le m \le N$,

\[ \mathrm{var}(t_n) \ge \frac{1}{\binom{N-1}{m-1}} \sum_s \mathrm{var}(t_{s,n}) \tag{11} \]

where the summation is extended over all combinations $s$ of $m$ elements from $\{1, \ldots, N\}$.

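Proof. Set $r = \binom{N-1}{m-1}$. From the definition (1) one has

\[ \mathrm{var}(t_n) = \sigma^2/n - \mathrm{var}\{E(\bar{x} \mid R)\} = \sum_{k=1}^{N} (\sigma_k^2/n) - \mathrm{var}\Big\{E\Big(\sum_{k=1}^{N} \bar{x}_k \,\Big|\, R\Big)\Big\}. \]

Similarly,

\[ (1/r) \sum_s \mathrm{var}(t_{s,n}) = (1/r) \sum_s \sum_{k \in s} \sigma_k^2/n - (1/r) \sum_s \mathrm{var}\{E(\bar{x}_s \mid R_s)\} = \sum_{k=1}^{N} (\sigma_k^2/n) - (1/r) \sum_s \mathrm{var}\{E(\bar{x}_s \mid R_s)\}, \]

where the last equality is due to the fact that each $k \in \{1, \ldots, N\}$ appears in exactly $r$ of the sets $s$. On setting $\phi_s = E(\bar{x}_s \mid R_s)$ and $w_s = \binom{N}{m}^{-1}$ for all $s$, and noticing that so defined $\phi_s$ depends only on $x_k$, $k \in s$, one has by virtue of Lemma 1

\[ r \sum_s \mathrm{var}\{E(\bar{x}_s \mid R_s)\} \ge \mathrm{var}\Big\{\sum_s E(\bar{x}_s \mid R_s)\Big\}. \tag{12} \]

Denote by $\bar{s}$ the complement of $s$ in $\{1, \ldots, N\}$. Then $R_s$ and $R_{\bar{s}}$ depend on disjoint sets of independent random vectors $x_1, \ldots, x_N$ and thus are independent.

By virtue of Lemma 2,

\[ \phi_s = E(\bar{x}_s \mid R_s, R_{\bar{s}}). \]

From the definition of the $n$-variate vectors $R_s$ and $R_{\bar{s}}$ one has $R = R_s + R_{\bar{s}}$. Now due to a well known property of the conditional expectation,

\[ E(\bar{x}_s \mid R) = E[E(\bar{x}_s \mid R_s, R_{\bar{s}}) \mid R] = E[E(\bar{x}_s \mid R_s) \mid R]. \]

Since for any random variable $\xi$ and random element $\eta$

\[ \mathrm{var}(\xi) \ge \mathrm{var}\{E(\xi \mid \eta)\}, \]

the previous relation results in

\[
\begin{aligned}
\mathrm{var}\Big\{\sum_s E(\bar{x}_s \mid R_s)\Big\} &\ge \mathrm{var}\Big\{E\Big(\sum_s E(\bar{x}_s \mid R_s) \,\Big|\, R\Big)\Big\} \\
&= \mathrm{var}\Big\{E\Big(\sum_s E(\bar{x}_s \mid R_s, R_{\bar{s}}) \,\Big|\, R\Big)\Big\} \\
&= \mathrm{var}\Big\{\sum_s E(\bar{x}_s \mid R)\Big\} = \mathrm{var}\Big\{E\Big(\sum_s \bar{x}_s \,\Big|\, R\Big)\Big\} \\
&= \mathrm{var}\Big\{E\Big(r \sum_{k=1}^{N} \bar{x}_k \,\Big|\, R\Big)\Big\} = r^2\, \mathrm{var}\{E(\bar{x} \mid R)\}.
\end{aligned} \tag{13}
\]

Combining (12) with (13) leads to

\[ \mathrm{var}\Big\{E\Big(\sum_{k=1}^{N} \bar{x}_k \,\Big|\, R\Big)\Big\} \le \frac{1}{r} \sum_s \mathrm{var}\{E(\bar{x}_s \mid R_s)\}, \tag{14} \]

which, in view of the two expressions for the variances above, is equivalent to the claimed result (11). □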
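
As a sanity check on (11), which is not taken from the text but follows from the standard fact that the Pitman estimator of a Gaussian location parameter is the sample mean, consider $F_k = N(0, \sigma_k^2)$, $k = 1, \ldots, N$. Then each $F_s$ is Gaussian with variance $\sum_{k \in s} \sigma_k^2$, so that $\mathrm{var}(t_{s,n}) = \sum_{k \in s} \sigma_k^2/n$ and $\mathrm{var}(t_n) = \sigma^2/n$. Since each index $k$ belongs to exactly $\binom{N-1}{m-1}$ of the sets $s$,

\[ \sum_s \mathrm{var}(t_{s,n}) = \sum_s \sum_{k \in s} \frac{\sigma_k^2}{n} = \binom{N-1}{m-1} \sum_{k=1}^{N} \frac{\sigma_k^2}{n} = \binom{N-1}{m-1}\, \mathrm{var}(t_n), \]

i.e., (11) holds with equality in the Gaussian case.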

It is of special interest to study the simple case where $F_1 = \cdots = F_N = H$. This gives the monotonicity of $\mathrm{var}(t_n^{(N)})/N$ with respect to the group number $N$, in contrast to (28) in Section 5 whose monotonicity is with respect to the sample size $n$.

Corollary 1. For any $N > 1$, if $t_n^{(N)}$ is the Pitman estimator of $\theta$ from a sample of size $n$ from $H^{*N}(x-\theta)$ where $H^{*N} = H * \cdots * H$, then

\[ \frac{\mathrm{var}(t_n^{(N)})}{N} \ge \frac{\mathrm{var}(t_n^{(N-1)})}{N-1}. \tag{15} \]

Here $n$ and $N$ are independent parameters.

Proof. Choose $m = N - 1$ in Theorem 1. Under the conditions of the Corollary, the $t_{s,n}$ are equidistributed for all $N$ combinations $s$ of $N - 1$ elements so that (11) becomes

\[ \mathrm{var}(t_n^{(N)}) \ge \frac{N}{N-1}\, \mathrm{var}(t_n^{(N-1)}). \]

□

Recall that for independent identically distributed $X_1, \ldots, X_N$, Artstein et al. [1] showed that

\[ N\, I(X_1 + \cdots + X_N) \le (N-1)\, I(X_1 + \cdots + X_{N-1}) \tag{16} \]

for any $N > 1$. As shown in Ibragimov and Has'minskii [7], if $I(X) < \infty$ and $\int |x|^\delta\,dF(x) < \infty$ for some $\delta > 0$,

\[ \mathrm{var}(t_n) = \frac{1}{n I(X)}(1 + o(1)), \quad n \to \infty. \tag{17} \]

Thus the inequality (15) may be considered a small sample version of inequality (16) for the Fisher information. Furthermore, note that the monotonicity (16) of Fisher information follows from (15) and (17).

Another corollary of Theorem 1 is a dissipative property of the conditional expectation of the sample mean.

Corollary 2. If $F_1 = \cdots = F_N = H$, then for any $N > 1$

\[ (N-1)\, \mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_{N-1})\} \ge N\, \mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_N)\}. \tag{18} \]

Proof. Since $x_{11}, \ldots, x_{Nn}$ are independent identically distributed random variables, one has for any $n$ and $N$

\[
\begin{aligned}
\mathrm{var}(t_n^{(N)}) &= \mathrm{var}\Big\{\sum_{k=1}^{N} \bar{x}_k - E\Big(\sum_{k=1}^{N} \bar{x}_k \,\Big|\, R_1 + \cdots + R_N\Big)\Big\} \\
&= \mathrm{var}\Big\{\sum_{k=1}^{N} \bar{x}_k\Big\} - \mathrm{var}\Big\{E\Big(\sum_{k=1}^{N} \bar{x}_k \,\Big|\, R_1 + \cdots + R_N\Big)\Big\} \\
&= \sigma^2/n - \mathrm{var}\{N E(\bar{x}_1 \mid R_1 + \cdots + R_N)\}
\end{aligned}
\]

that combined with (15) immediately leads to (18). □

Notice that (18) is much stronger than the monotonicity of $\mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_N)\}$ that follows directly from

\[ \mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_{N-1})\} = \mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_{N-1}, R_N)\} \ge \mathrm{var}\{E(\bar{x}_1 \mid R_1 + \cdots + R_N)\}, \]

due to independence of $(\bar{x}_1, R_1, \ldots, R_{N-1})$ and $x_N$.

3 A corollary and an analytical characterization 
problem related to the Pitman estimators 

Turn now to an elegant corollary of Theorem 1. On setting $m = 1$ in Theorem 1 the subsets $s$ are reduced to one element each, $s = \{k\}$, $k = 1, \ldots, N$, and one gets the superadditivity inequality from Kagan [10]:

Corollary 3. If $t_n^{(1)}, \ldots, t_n^{(N)}$ are the Pitman estimators from samples of size $n$ from $F_1(x-\theta), \ldots, F_N(x-\theta)$, and $t_n$ is the Pitman estimator from a sample of size $n$ from $F(x-\theta)$ where $F = F_1 * \cdots * F_N$, then

\[ \mathrm{var}(t_n) \ge \sum_{k=1}^{N} \mathrm{var}(t_n^{(k)}). \tag{19} \]

An interesting analytic problem, a Cauchy type functional equation for independent random variables, arises in connection with the relation

\[ \mathrm{var}(t_n) = \sum_{k=1}^{N} \mathrm{var}(t_n^{(k)}). \tag{20} \]

We will show below that with some conditions on $F_1, \ldots, F_N$, (20) is a characteristic property of Gaussian distributions. Note that to study the relation (20), it suffices to consider the case of $N = 2$.

Let $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$ be independent samples from populations $F_1(x-\theta_1)$, $F_2(y-\theta_2)$, respectively, and let $t'_n$ and $t''_n$ be the Pitman estimators of $\theta_1$ and $\theta_2$. The Pitman estimator of $\theta_1 + \theta_2$ from the combined sample $(x_1, \ldots, x_n, y_1, \ldots, y_n)$ is $t'_n + t''_n$.

For the Pitman estimator $t_n$ of $\theta$ from a sample of size $n$ from population $(F_1 * F_2)(x-\theta)$, consider $t_n(x_1 + y_1, \ldots, x_n + y_n)$. This is an equivariant estimator of $\theta_1 + \theta_2$ from the above combined sample, so that

\[ \mathrm{var}(t_n) \ge \mathrm{var}(t'_n) + \mathrm{var}(t''_n). \tag{21} \]

Due to the uniqueness of the Pitman estimator, the equality sign in (21) holds if and only if

\[ t_n(x_1 + y_1, \ldots, x_n + y_n) = t'_n(x_1, \ldots, x_n) + t''_n(y_1, \ldots, y_n) \tag{22} \]

with probability one. This is a Cauchy type functional equation holding for random variables and is different from the classical Cauchy equation.

It turns out that even in the simplest case of $n = 1$ when the equation is of the form

\[ f(X) + g(Y) = h(X + Y) \tag{23} \]

and $X$, $Y$ are independent continuous random variables, solutions $f$, $g$ of (23) may be nonlinear.

Indeed, let $\xi$ be a uniform random variable on $(0, 1)$. Consider its dyadic representation

\[ \xi = \sum_{k=1}^{\infty} \frac{\xi_k}{2^k} \]

where $\xi_1, \xi_2, \ldots$ are independent binary random variables with $P(\xi_k = 0) = P(\xi_k = 1) = .5$. Now set

\[ X = \sum_{k \ \mathrm{even}} \frac{\xi_k}{2^k}, \quad Y = \sum_{k \ \mathrm{odd}} \frac{\xi_k}{2^k}. \]

Then $X$ and $Y$ are independent random variables with continuous (though singular) distributions and they both are functions of $X + Y = \xi$ ($X$ and $Y$ are strong components of $\xi$, in the terminology of Hoffmann-Jorgensen et al. [6]). Thus, for any measurable functions $f$ and $g$, the relation (23) holds with a suitable measurable $h$.
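
The construction above can be illustrated on a truncated digit sequence. The following sketch is only an illustration under stated assumptions: it works with finitely many binary digits (so all quantities are dyadic rationals), and the helper names `split_dyadic`, `digits` and `recover_from_sum`, as well as the particular nonlinear choices of f and g, are hypothetical. It shows that X and Y can be read off from X + Y alone, so that h can be built from arbitrary f and g.

```python
from fractions import Fraction

def split_dyadic(bits):
    """Split binary digits xi_1, xi_2, ... into even- and odd-indexed parts.

    Returns (X, Y) with X = sum over even k of xi_k / 2^k and
    Y = sum over odd k of xi_k / 2^k (indices k start at 1).
    """
    X = sum(Fraction(b, 2 ** k) for k, b in enumerate(bits, start=1) if k % 2 == 0)
    Y = sum(Fraction(b, 2 ** k) for k, b in enumerate(bits, start=1) if k % 2 == 1)
    return X, Y

def digits(value, n_bits):
    """First n_bits binary digits of a dyadic rational value in [0, 1)."""
    out = []
    for _ in range(n_bits):
        value *= 2
        bit = int(value >= 1)
        out.append(bit)
        value -= bit
    return out

def recover_from_sum(total, n_bits):
    """Recover (X, Y) from X + Y alone by re-reading the binary digits."""
    return split_dyadic(digits(total, n_bits))

bits = [1, 0, 1, 1, 0, 1, 0, 0]   # a truncated, hypothetical digit sequence
X, Y = split_dyadic(bits)
X_rec, Y_rec = recover_from_sum(X + Y, len(bits))

# Arbitrary (here nonlinear) f and g, and the h they induce through X + Y.
def f(x):
    return x ** 2

def g(y):
    return 3 * y + 1

def h(z):
    x, y = recover_from_sum(z, len(bits))
    return f(x) + g(y)
```

With these definitions `f(X) + g(Y)` coincides with `h(X + Y)` although neither f nor h is linear, which mirrors the phenomenon described in the text.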

On the other hand, if both $X$ and $Y$ have positive almost everywhere (with respect to the Lebesgue measure) densities and $f$, $g$ are locally integrable functions, then the equation (23) has only linear solutions $f$, $g$ (and certainly $h$).

From positivity of the densities, one has

\[ f(x) + g(y) = h(x + y) \tag{24} \]

almost everywhere (with respect to the plane Lebesgue measure). Taking a smooth function $k(x)$ with compact support, multiplying both sides of (24) by $k(x)$ and integrating over $x$ results in

\[ \int_{-\infty}^{+\infty} f(x) k(x)\,dx + g(y) \int_{-\infty}^{+\infty} k(x)\,dx = \int_{-\infty}^{+\infty} h(x + y) k(x)\,dx = \int_{-\infty}^{+\infty} h(u) k(u - y)\,du, \]

where the right hand side is continuous in $y$. Thus, $g(y)$ is continuous and so is $f(x)$, implying that (24) holds for all (and not almost all) $x$, $y$ (the idea is due to Hillel Furstenberg).

Now (24) becomes the classical Cauchy equation that has only linear solutions.

Returning to (22) and noticing that $E|t'_n| < \infty$, $E|t''_n| < \infty$, one concludes that if $F_1$ and $F_2$ are given by almost everywhere positive densities, then for almost all $(x_1, \ldots, x_n, y_1, \ldots, y_n)$ (with respect to the Lebesgue measure in $\mathbb{R}^{2n}$)

\[ t'_n(x_1, \ldots, x_n) + t''_n(y_1, \ldots, y_n) = t_n(x_1 + y_1, \ldots, x_n + y_n). \tag{25} \]

Treating (25) as a Cauchy type equation in $x_i$, $y_i$ with the remaining $n - 1$ pairs of the arguments fixed, one gets the linearity of $t'_n$, $t''_n$ in each of their arguments whence due to the symmetry $t'_n = \bar{x}$, $t''_n = \bar{y}$, implying for $n \ge 3$ that $F_1$ and $F_2$ are Gaussian. Thus, the following result is proved.

Theorem 2. Let $t_n^{(1)}, \ldots, t_n^{(N)}$, $N > 1$, be the Pitman estimators of $\theta$ from samples of size $n \ge 3$ from populations $F_1(x-\theta), \ldots, F_N(x-\theta)$ with finite second moments and almost everywhere positive densities, and $t_n$ the Pitman estimator from a sample of size $n$ from $(F_1 * \cdots * F_N)(x-\theta)$. Then

\[ \mathrm{var}(t_n) = \sum_{k=1}^{N} \mathrm{var}(t_n^{(k)}) \]

if and only if all the populations are Gaussian.

4 Combining independent samples from different 
populations 

Let $(x_1^{(k)}, \ldots, x_{n_k}^{(k)})$, $k = 1, \ldots, N$ be independent samples of sizes $n_1, \ldots, n_N$ from populations $F_1(x-\theta), \ldots, F_N(x-\theta)$ with finite variances and $t_{n_k}^{(k)}$ be the Pitman estimator of $\theta$ from the sample $(x_1^{(k)}, \ldots, x_{n_k}^{(k)})$ of size $n_k$. For $s = \{i_1, \ldots, i_m\}$, we denote by $t_{n(s)}^{(s)}$ the Pitman estimator of $\theta$ from the sample of size $n(s) = n_{i_1} + \cdots + n_{i_m}$ that is obtained from pooling samples with superindices from $s$. By $t_n^{(1,\ldots,N)}$ we denote the Pitman estimator of $\theta$ from the sample $(x_1^{(1)}, \ldots, x_{n_N}^{(N)})$ of size $n = n_1 + \cdots + n_N$. Trivially, $\mathrm{var}(t_n^{(1,\ldots,N)})$ is the smallest among $\mathrm{var}(t_{n(s)}^{(s)})$. Here a stronger result is proved.

Theorem 3. The following inequality holds:

\[ \frac{1}{\mathrm{var}(t_n^{(1,\ldots,N)})} \ge \frac{1}{\binom{N-1}{m-1}} \sum_s \frac{1}{\mathrm{var}(t_{n(s)}^{(s)})} \tag{26} \]

where the summation in (26) is over all combinations $s$ of $m$ elements from $\{1, \ldots, N\}$.

Proof. On setting in Lemma 1 $\phi_s = t_{n(s)}^{(s)}$, and choosing the weights $w_s$ minimizing the right hand side of (8),

\[ w_s = \frac{\pi_s}{\sum_s \pi_s}, \]

where $\pi_s = 1/\mathrm{var}(t_{n(s)}^{(s)})$, one gets

\[ \mathrm{var}\Big(\sum_s w_s t_{n(s)}^{(s)}\Big) \le \binom{N-1}{m-1} \frac{1}{\sum_s 1/\mathrm{var}(t_{n(s)}^{(s)})}. \]

For the sample $(x_1^{(1)}, \ldots, x_{n_N}^{(N)})$, $\sum_s w_s t_{n(s)}^{(s)}$ is an equivariant estimator while $t_n^{(1,\ldots,N)}$ is the Pitman estimator. Thus

\[ \mathrm{var}(t_n^{(1,\ldots,N)}) \le \mathrm{var}\Big(\sum_s w_s t_{n(s)}^{(s)}\Big) \]

which, combined with the previous inequality, is exactly (26). □
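
The choice of weights in the proof is the usual inverse-variance weighting. The sketch below is illustrative only; the numerical variances are made-up values and the function names are hypothetical. It checks on one example that weights proportional to $\pi_s = 1/\mathrm{var}_s$ turn $\sum_s w_s^2 \mathrm{var}_s$ into $1/\sum_s \pi_s$ and that uniform weights do worse.

```python
from fractions import Fraction

def optimal_weights(variances):
    """Weights w_s proportional to 1/var_s, normalized to sum to one."""
    precisions = [1 / Fraction(v) for v in variances]
    total = sum(precisions)
    return [p / total for p in precisions]

def weighted_bound(weights, variances):
    """The quantity sum_s w_s^2 var_s appearing on the right of (8)."""
    return sum(w * w * Fraction(v) for w, v in zip(weights, variances))

variances = [2, 3, 6]                      # hypothetical values of var(t^(s))
w_opt = optimal_weights(variances)
best = weighted_bound(w_opt, variances)    # equals 1 / sum(1/var_s)
uniform = weighted_bound([Fraction(1, 3)] * 3, variances)
harmonic = 1 / sum(1 / Fraction(v) for v in variances)
```

Multiplying `best` by the combinatorial factor from Lemma 1 gives the upper bound used in the proof.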

In a special case, when the subsets $s$ consist of one element and $t_{n_k}^{(k)}$ is the Pitman estimator from $(x_1^{(k)}, \ldots, x_{n_k}^{(k)})$, Theorem 3 becomes

\[ \frac{1}{\mathrm{var}(t_n^{(1,\ldots,N)})} \ge \frac{1}{\mathrm{var}(t_{n_1}^{(1)})} + \cdots + \frac{1}{\mathrm{var}(t_{n_N}^{(N)})}. \tag{27} \]

This inequality is reminiscent of Carlen's superadditivity for the trace of the Fisher information matrix, which involves the Fisher informations obtained by taking the limit as sample sizes go to infinity. However, Carlen's superadditivity is true for random variables with arbitrary dependence, whereas (27) has only been proved under the assumption of independence of samples.

5 Some corollaries, including the monotonicity of $n\,\mathrm{var}(t_n)$

Notice that if for a sample of size $m$ from $F(x-\theta)$, $\mathrm{var}(t_m) < \infty$, then $\mathrm{var}(t_n) < \infty$ for samples $(x_1, \ldots, x_n)$ of any size $n \ge m$.

Set $F_1 = \cdots = F_N = F$, $n_1 = \cdots = n_N = 1$, and $m = N - 1$ in Theorem 3. Then $n(s) = N - 1$ for each $s$ with $m$ elements, and $n = N$, and Theorem 3 reads

\[ \frac{1}{\mathrm{var}(t_N^{(1,\ldots,N)})} \ge \frac{1}{N-1} \sum_s \frac{1}{\mathrm{var}(t_{N-1}^{(s)})} = \frac{N}{N-1} \cdot \frac{1}{\mathrm{var}(t_{N-1}^{(1,\ldots,N-1)})}, \]

where the last equality is due to symmetry. Now $t_N^{(1,\ldots,N)}$ is just the Pitman estimator of $\theta$ from a sample of size $N$ from $F(x-\theta)$. Thus, interpreting $N$ as sample size instead of group size, we have the following result.

Theorem 4. Let $t_n$ be the Pitman estimator of $\theta$ from a sample of size $n$ from a population $F(x-\theta)$. If for some $m$, $\mathrm{var}(t_m) < \infty$, then for all $n \ge m + 1$

\[ (n+1)\,\mathrm{var}(t_{n+1}) \le n\,\mathrm{var}(t_n). \tag{28} \]

For $n \ge 2$, the equality sign holds if and only if $F$ is Gaussian.
Remarks:

1. If $F$ is Gaussian $N(0, \sigma^2)$, then clearly $n\,\mathrm{var}(t_n) = \sigma^2$ for all $n$. In fact, the equality

\[ n\,\mathrm{var}(t_n) = (n+1)\,\mathrm{var}(t_{n+1}) \]

holding for any $n \ge 2$ characterizes the Gaussian distribution since it implies the additive decomposability of $t_n$. If an equivariant estimator is additively decomposable, it is linear and due to the symmetry of $t_n$ one has $t_n = \bar{x}$.

2. The condition of Theorem 4 is fulfilled for $m = 1$ (and thus for any $m$) if $\int x^2\,dF(x) < \infty$. However, for many $F$ with infinite second moment (e.g., Cauchy), $\mathrm{var}(t_m) < \infty$ for some $m$ and Theorem 4 holds.

3. Note that even absolute continuity of $F$ is not required, not to mention the finiteness of the Fisher information.

4. If F is the distribution function of an exponential distribution with 
parameter 1/A, 

, . 2Xn 

nvar (t n ) = - — -. 

Kn ' (n+l)(n + 2) 

If F is the distribution function of a uniform distribution on (—1, 1), 

. . 4n 2 
nvar(t n ) 



(n + l) 2 (n + 2)' 

In these examples, the Fisher information is infinite, but one clearly 
has monotonicity. 

5. One can call $F$ Pitman regular if

\[ \lim_{n \to \infty} n\,\mathrm{var}(t_n) > 0 \tag{29} \]

and nonregular if the limit in (29) (that always exists) is zero. As mentioned earlier, Ibragimov and Has'minskii [7] showed that under rather mild conditions on $F$ that include the finiteness of the Fisher information $I$,

\[ \lim_{n \to \infty} n\,\mathrm{var}(t_n) = 1/I. \]

Under these conditions, Theorem 4 implies monotone convergence of $n\,\mathrm{var}(t_n)$ to its limit.

A corollary of Theorem 4 is worth mentioning.

Corollary 4. Let $(x_1, \ldots, x_{n+m})$, $m + n \ge 3$, be a sample from the population $F(x-\theta)$ with finite variance. If $t_m$ is the Pitman estimator of $\theta$ from the first $m$ and $t_n$ from the last $n$ observations, then

\[ t_{n+m} = w_1 t_n + w_2 t_m \]

for some $w_1$, $w_2$ if and only if $F$ is Gaussian.

Proof. One can easily see that necessarily $w_1 = n/(m+n)$, $w_2 = m/(m+n)$ so that

\[
\begin{aligned}
\mathrm{var}(t_{m+n}) &= \Big(\frac{m}{m+n}\Big)^2 \mathrm{var}(t_m) + \Big(\frac{n}{m+n}\Big)^2 \mathrm{var}(t_n) \\
&\ge \Big(\frac{m}{m+n}\Big) \mathrm{var}(t_{m+n}) + \Big(\frac{n}{m+n}\Big) \mathrm{var}(t_{m+n}) = \mathrm{var}(t_{m+n}),
\end{aligned}
\]

where the inequality follows from Theorem 4, the equality sign holding only if $(m+n)\,\mathrm{var}(t_{m+n}) = m\,\mathrm{var}(t_m) = n\,\mathrm{var}(t_n)$, which by Theorem 4 is possible only for Gaussian $F$. □

We can now characterize equality for another special case of Theorem 3.

Corollary 5. Let $t_m$ be the Pitman estimator from a sample of size $m$ from $F(x-\theta)$. Then one has superadditivity with respect to the sample size,

\[ \frac{1}{\mathrm{var}(t_n)} \ge \frac{1}{\mathrm{var}(t_{n_1})} + \cdots + \frac{1}{\mathrm{var}(t_{n_N})}, \quad n = n_1 + \cdots + n_N, \tag{30} \]

with equality if and only if $F$ is Gaussian.

Proof. Taking $F_1 = \cdots = F_N = F$ in Theorem 3 immediately gives (30). To understand when the equality sign holds in (30), it suffices to consider the case of $N = 2$. Set $n_1 = l$, $n_2 = m$, $n = l + m$. The equality sign in

\[ \frac{1}{\mathrm{var}(t_n)} \ge \frac{1}{\mathrm{var}(t_l)} + \frac{1}{\mathrm{var}(t_m)} \]

holds if and only if

\[ t_n = w_1 t_l + w_2 t_m, \quad \text{with } w_1 = l/n, \ w_2 = m/n. \]

According to Corollary 4, the last relation holds if and only if $F$ is Gaussian. □

Another corollary of interest that looks similar in form to Corollary 2 of Section 2, but is of a different nature, follows immediately from combining Theorem 4 and the definition (1).

Corollary 6. For independent identically distributed $X_1, X_2, \ldots$ with $\mathrm{var}(X_i) = \sigma^2 < \infty$ set

\[ \bar{X}_n = (X_1 + \cdots + X_n)/n. \]

Then for any $n \ge 1$,

\[ (n+1)\,\mathrm{var}\,E(\bar{X}_{n+1} \mid X_1 - \bar{X}_{n+1}, \ldots, X_{n+1} - \bar{X}_{n+1}) \ge n\,\mathrm{var}\,E(\bar{X}_n \mid X_1 - \bar{X}_n, \ldots, X_n - \bar{X}_n). \]

In the regular case when $\lim_{n \to \infty} n\,\mathrm{var}(t_n) = 1/I$,

\[ \lim_{n \to \infty} n\,\mathrm{var}\,E(\bar{X}_n \mid X_1 - \bar{X}_n, \ldots, X_n - \bar{X}_n) = \sigma^2 - 1/I. \]

It would be interesting to study the asymptotic behavior as $n \to \infty$ of the random variable

\[ E(\sqrt{n}\,\bar{X}_n \mid X_1 - \bar{X}_n, \ldots, X_n - \bar{X}_n). \]
6 Multivariate extensions

An extension of Theorem 1 to the multivariate case depends on a generalization of the variance drop lemma (Lemma 1) to the case of $s$-variate vector functions. Using the Cramér-Wold principle, for an arbitrary vector $c \in \mathbb{R}^s$ and vector functions $\psi_s = \psi_s(X_s)$, set

\[ \phi_s(X_s) = c^T \psi_s(X_s). \]

Thus Lemma 1 implies

\[ c^T\, \mathrm{var}\Big\{\sum_s w_s \psi_s(X_s)\Big\}\, c \le \binom{N-1}{m-1} \sum_s w_s^2\, c^T\, \mathrm{var}\{\psi_s(X_s)\}\, c. \]

This is equivalent to

\[ \mathrm{var}\Big\{\sum_s w_s \psi_s(X_s)\Big\} \le \binom{N-1}{m-1} \sum_s w_s^2\, \mathrm{var}\{\psi_s(X_s)\}, \]

where var means the covariance matrix; hence Lemma 1 holds in the multivariate case if we interpret the inequality in terms of the Loewner ordering.

In Theorem 1, if $X_1, \ldots, X_n$ are independent $s$-variate random vectors with distribution $F(x-\theta)$, $x, \theta \in \mathbb{R}^s$, all the results and the proof remain true where an inequality $A \ge B$ for matrices $A$, $B$ means, as usual, that the matrix $A - B$ is non-negative definite.

Corollary 5 remains valid in the multivariate case when the above samples come from $s$-variate populations depending on $\theta \in \mathbb{R}^s$, assuming that the covariance matrices of the involved Pitman estimators are nonsingular. The latter condition is extremely mild. Indeed, if the covariance matrix $V$ of the Pitman estimator $t_n$ from a sample of size $n$ from an $s$-variate population $F(x-\theta)$ is singular, then for a nonzero (column) vector $a \in \mathbb{R}^s$

\[ \mathrm{var}(a' t_n) = a' V a = 0, \]

(prime stands for transposition) meaning that the linear function $a'\theta$ is estimable with zero variance. This implies that any two distributions in $\mathbb{R}^{ns}$ generated by samples of size $n$ from $F(x-\theta_1)$ and $F(x-\theta_2)$ with $a'\theta_1 \ne a'\theta_2$ are mutually singular and so are the measures in $\mathbb{R}^s$ with distribution functions $F(x-\theta_1)$ and $F(x-\theta_2)$. Since for any $\theta_1$ there exists an arbitrarily close $\theta_2$ with $a'\theta_1 \ne a'\theta_2$, singularity of the covariance matrix of the Pitman estimator would imply an extreme irregularity of the family $\{F(x-\theta), \theta \in \mathbb{R}^s\}$. In the multivariate case (27) takes the form of

\[ V^{-1}(t_n^{(1,\ldots,N)}) \ge V^{-1}(t_{n_1}^{(1)}) + \cdots + V^{-1}(t_{n_N}^{(N)}) \tag{31} \]

where $V(t)$ is the covariance matrix of a random vector $t$. To prove (31), take matrix-valued weights

\[ W_k = \big(V^{-1}(t_{n_1}^{(1)}) + \cdots + V^{-1}(t_{n_N}^{(N)})\big)^{-1} V^{-1}(t_{n_k}^{(k)}), \quad k = 1, \ldots, N. \tag{32} \]

Since $W_1 + \cdots + W_N$ is the identity matrix, $W_1 t_{n_1}^{(1)} + \cdots + W_N t_{n_N}^{(N)}$ is an equivariant estimator of $\theta$ so that its covariance matrix exceeds that of the Pitman estimator,

\[ V(t_n^{(1,\ldots,N)}) \le V(W_1 t_{n_1}^{(1)} + \cdots + W_N t_{n_N}^{(N)}) = W_1 V(t_{n_1}^{(1)}) W_1' + \cdots + W_N V(t_{n_N}^{(N)}) W_N'. \]

Substituting the weights (32) into the last inequality gives (31).

If $(x_1, \ldots, x_n)$ is a sample from the multivariate population $F(x-\theta)$ (where both $x$ and $\theta$ are vectors), the monotonicity of Theorem 4 holds for the covariance matrix $V_n$ of the Pitman estimator, i.e.,

\[ n V_n \ge (n+1) V_{n+1}. \]

The proof is the same as that of the univariate case, but uses the multivariate version of Lemma 1 discussed at the beginning of this section.
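
The substitution of the matrix weights (32) can be checked numerically. The sketch below is an illustration under assumptions: it uses two made-up symmetric positive definite 2x2 covariance matrices with exact rational entries, and the small helper functions are hypothetical conveniences rather than anything from the text. It verifies that the weights sum to the identity and that $\sum_k W_k V_k W_k'$ collapses to $(\sum_k V_k^{-1})^{-1}$.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    """Sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    """Inverse of a nonsingular 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def to_fractions(rows):
    return [[Fraction(x) for x in row] for row in rows]

# Hypothetical covariance matrices of two Pitman estimators.
V1 = to_fractions([[2, 1], [1, 2]])
V2 = to_fractions([[1, 0], [0, 3]])

S = mat_add(mat_inv(V1), mat_inv(V2))      # sum of inverse covariances
S_inv = mat_inv(S)
W1 = mat_mul(S_inv, mat_inv(V1))           # weights as in (32)
W2 = mat_mul(S_inv, mat_inv(V2))

combined = mat_add(mat_mul(mat_mul(W1, V1), transpose(W1)),
                   mat_mul(mat_mul(W2, V2), transpose(W2)))
```

Here `combined` is the covariance matrix of the weighted estimator under independence, and its equality with `S_inv` is the step that turns the last displayed inequality into (31).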



7 Extensions to polynomial Pitman estimators 

Assuming

\[ \int x^{2k}\,dF(x) < \infty \tag{33} \]

for some integer $k \ge 1$, the polynomial Pitman estimator $t_n^{(k)}$ of degree $k$ is, by definition, the minimum variance equivariant polynomial estimator (see Kagan [11]). Let $M_k = M_k(x_1 - \bar{x}, \ldots, x_n - \bar{x})$ be the space of all polynomials of degree $k$ in the residuals. Also, let $E(\cdot \mid M_k)$ be the projection into $M_k$ in the (finite-dimensional) Hilbert space of polynomials in $x_1, \ldots, x_n$ of degree $k$ with the standard inner product

\[ (q_1, q_2) = E(q_1 q_2). \]

Then the polynomial Pitman estimator can be represented as

\[ t_n^{(k)} = \bar{x} - E(\bar{x} \mid M_k). \tag{34} \]

Plainly, it depends only on the first $2k$ moments of $F$.

To extend our earlier results to the polynomial Pitman estimators $t_n^{(k)}$ under the assumption $\int x^{2k}\,dF(x) < \infty$, the following properties of the projection operators are useful:

1. For any index set $s$,

\[ M_k(R) = M_k(R_s + R_{\bar{s}}) \subset M_k(R_s, R_{\bar{s}}) \]

so that for any random variable $\xi$

\[ \mathrm{var}\{E(\xi \mid M_k(R))\} \le \mathrm{var}\{E(\xi \mid M_k(R_s, R_{\bar{s}}))\}. \]

2. Let $\xi$ be a random variable such that the pair $(\xi, R_s)$ is independent of $R_{\bar{s}}$ (actually, it suffices to assume uncorrelatedness). Then

\[ E(\xi \mid M_k(R_s, R_{\bar{s}})) = E(\xi \mid M_k(R_s)). \]

Substituting the conditional expectations in the proof of Theorem 1 by the projection operators $E(\cdot \mid M_k)$, the following version of Theorem 1 for polynomial Pitman estimators can be proved.

Theorem 1'. If for some integer $k \ge 1$, $\int x^{2k}\,dF_j(x) < \infty$, $j = 1, \ldots, N$, the variances of the polynomial Pitman estimators $t_{s,n}^{(k)}$ satisfy the inequality

\[ \mathrm{var}(t_n^{(k)}) \ge \frac{1}{\binom{N-1}{m-1}} \sum_s \mathrm{var}(t_{s,n}^{(k)}). \]

Assuming that for some integer $m \ge 1$

\[ \int x^{2m}\,dF_k(x) < \infty, \quad k = 1, \ldots, N, \]

Corollary 5 also easily extends to the polynomial Pitman estimators of degree $m$.

Similarly, under the condition (33) for some integer $k \ge 1$, Theorem 4 extends to the polynomial Pitman estimator defined in (34). The polynomial Pitman estimator $t_{n,i}^{(k)}$ of degree $k$ from $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$ is equidistributed with $t_{n-1}^{(k)}$ and thus

\[ \mathrm{var}(t_{n,i}^{(k)}) = \mathrm{var}(t_{n-1}^{(k)}). \]

The estimator $(1/n) \sum_{i=1}^{n} t_{n,i}^{(k)}$ is equivariant (for the sample $(x_1, \ldots, x_n)$) and since $t_n^{(k)}$ is the polynomial Pitman estimator,

\[ \mathrm{var}(t_n^{(k)}) \le (1/n^2)\, \mathrm{var}\Big(\sum_{i=1}^{n} t_{n,i}^{(k)}\Big). \tag{35} \]

By the $m = N - 1$ special case of the variance drop lemma,

\[ \mathrm{var}\Big(\sum_{i=1}^{n} t_{n,i}^{(k)}\Big) \le (n-1) \sum_{i=1}^{n} \mathrm{var}(t_{n,i}^{(k)}) = n(n-1)\, \mathrm{var}(t_{n-1}^{(k)}). \tag{36} \]

Combining the last two inequalities gives $n\,\mathrm{var}(t_n^{(k)}) \le (n-1)\,\mathrm{var}(t_{n-1}^{(k)})$ or, replacing $n$ by $n + 1$,

\[ (n+1)\,\mathrm{var}(t_{n+1}^{(k)}) \le n\,\mathrm{var}(t_n^{(k)}), \tag{37} \]

i.e., $n\,\mathrm{var}(t_n^{(k)})$ decreases with $n$.

In Kagan et al. [13] it is shown that under only the moment condition (33), $n\,\mathrm{var}(t_n^{(k)}) \to 1/I^{(k)}$ as $n \to \infty$ where $I^{(k)}$ can be interpreted as the Fisher information on $\theta$ contained in the first $2k$ moments of $F$ (see Kagan [12]). For any increasing sequence $k(n)$, one sees that $n\,\mathrm{var}(t_n^{(k(n))})$ decreases with $n$, and the limit can be equal to $1/I$ under some additional conditions. Indeed, if the span of all the polynomials in $X$ with distribution function $F$ coincides with $L^2(F)$, the space of all square integrable functions of $X$, then $I^{(k)} \to I$ as $k \to \infty$.

The above proof of monotonicity is due to the fact that the classes where $t_n$ and $t_n^{(k)}$ are the best are rather large. To illustrate this, consider the following analog of $t_n^{(k)}$:

\[ \tau_n^{(k)} = \bar{x} - E(\bar{x} \mid 1, m_2, \ldots, m_k) = \bar{x} - \sum_{j=0}^{k} a_{j,n} m_j \]

where $m_j = (1/n) \sum_i (x_i - \bar{x})^j$ and $E(\bar{x} \mid 1, m_2, \ldots, m_k)$ is the projection of $\bar{x}$ into the space $\mathrm{span}(1, m_2, \ldots, m_k)$ (i.e., the best mean square approximation of $\bar{x}$ by linear combinations of the sample central moments of orders up to $k$). As shown in Kagan et al. [13], if $\int x^{2k}\,dF(x) < \infty$, the behavior of $\tau_n^{(k)}$ as $n \to \infty$ is the same as that of $t_n^{(k)}$: $\sqrt{n}(\tau_n^{(k)} - \theta)$ converges in distribution to a random variable that has a Gaussian distribution $N(0, 1/I^{(k)})$, and $n\,\mathrm{var}(\tau_n^{(k)}) \to 1/I^{(k)}$. However, it does not seem likely that (37) holds for $\tau_n^{(k)}$.

8 Additive perturbations with a scale parameter 

In this section the setup of a sample $(x_1, \ldots, x_n)$ from a population
$F_\lambda(x - \theta)$ is considered, where

$$F_\lambda(x) = \int F(y)\,dG((x-y)/\lambda).$$

In other words, an observation $X$ with distribution function $F(x - \theta)$ is
perturbed by an independent additive noise $\lambda Y$ with $P(Y \le y) = G(y)$.

We study the behavior of the variance $\mathrm{var}(t_{n,\lambda})$, as a
function of $\lambda$, of the Pitman estimator of $\theta$ from a sample of
size $n$ from $F_\lambda(x - \theta)$. For the so-called self-decomposable $Y$,
it is proved that $\mathrm{var}(t_{n,\lambda})$ behaves "as expected", i.e.,
monotonically decreases for $\lambda \in (-\infty, 0)$ and increases for
$\lambda \in (0, +\infty)$.

A random variable $Y$ is said to be self-decomposable if for any
$c \in (0, 1)$, $Y$ is equidistributed with $cY + Z_c$, i.e.,

$$Y \stackrel{d}{=} cY + Z_c, \tag{38}$$

where $Z_c$ is independent of $Y$. If $f(t)$ is the characteristic function of
$Y$, then (38) is equivalent to

$$f(t) = f(ct)\,g_c(t),$$

where $g_c(t)$ is a characteristic function. All random variables having stable
distributions are self-decomposable. A self-decomposable random variable
is necessarily infinitely divisible. In Lukacs [14, ?] necessary and sufficient
conditions are given for self-decomposability in terms of the Lévy spectral
function.
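
A standard illustration of (38), added here and not taken from the source: for a standard Gaussian $Y$ the factor $g_c$ can be written down explicitly, which shows that $Z_c$ may again be taken Gaussian. The choice $Y \sim N(0,1)$ is an assumption made only for this example.

```latex
\[
f(t) = e^{-t^2/2}, \qquad
g_c(t) = \frac{f(t)}{f(ct)} = e^{-(1-c^2)t^2/2}, \qquad 0 < c < 1.
\]
% g_c is the characteristic function of N(0, 1 - c^2), so one may take
% Z_c ~ N(0, 1 - c^2) independent of Y; then
% var(cY + Z_c) = c^2 + (1 - c^2) = 1 = var(Y), as (38) requires.
```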

Theorem 5. Let $X$ be an arbitrary random variable with $E(X^2) < \infty$ and
$Y$ a self-decomposable random variable with $E(Y^2) < \infty$ independent of $X$.
Then the variance $\mathrm{var}(t_{n,\lambda})$ of the Pitman estimator of
$\theta$ from a sample of size $n$ from $F_\lambda(x - \theta)$ is increasing
in $\lambda$ on $(0, \infty)$ and decreasing on $(-\infty, 0)$.

Proof. If $x_1, \ldots, x_n, y_1, \ldots, y_n$ are independent random variables,
the $x$'s with distribution $F(x - \theta)$ and the $y$'s with distribution
$G(y)$, then

$$t_{n,\lambda} = \bar x + \lambda \bar y - E(\bar x + \lambda \bar y \mid x_1 - \bar x + \lambda(y_1 - \bar y), \ldots, x_n - \bar x + \lambda(y_n - \bar y))$$

and

$$\mathrm{var}(t_{n,\lambda}) = \mathrm{var}(\bar x + \lambda \bar y) - \mathrm{var}\{E(\bar x + \lambda \bar y \mid x_1 - \bar x + \lambda(y_1 - \bar y), \ldots, x_n - \bar x + \lambda(y_n - \bar y))\}.$$

If $\lambda_2 > \lambda_1 > 0$, then $\lambda_1 = c\lambda_2$ for some $c$, $0 < c < 1$.

Due to self-decomposability of $y_i$, there exist random variables
$z_{c,1}, \ldots, z_{c,n}$ such that

$$y_i - \bar y \stackrel{d}{=} c(y_i - \bar y) + (z_{c,i} - \bar z_c) \tag{39}$$

and the random variables $x_1, \ldots, x_n, y_1, \ldots, y_n, z_{c,1}, \ldots, z_{c,n}$
are independent.

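The $\sigma$-algebra

$$\sigma(x_1 - \bar x + \lambda_2(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2(y_n - \bar y)) =$$
$$\sigma(x_1 - \bar x + \lambda_2 c(y_1 - \bar y) + \lambda_2(z_{c,1} - \bar z_c), \ldots, x_n - \bar x + \lambda_2 c(y_n - \bar y) + \lambda_2(z_{c,n} - \bar z_c))$$

is smaller than the $\sigma$-algebra

$$\sigma(x_1 - \bar x + \lambda_2 c(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2 c(y_n - \bar y), z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)$$

and thus

$$\mathrm{var}\{E(\bar x + \lambda_2 \bar y \mid x_1 - \bar x + \lambda_2(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2(y_n - \bar y))\} \le$$
$$\mathrm{var}\{E(\bar x + \lambda_2 \bar y \mid x_1 - \bar x + \lambda_2 c(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2 c(y_n - \bar y), z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)\}.$$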

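From (39) and Lemma 2 in Section 2 one can rewrite the right hand side of
the above inequality as

$$\mathrm{var}\{E(\bar x + \lambda_2 \bar y \mid x_1 - \bar x + \lambda_2 c(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2 c(y_n - \bar y), z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)\} =$$
$$\mathrm{var}\{E(\bar x + \lambda_2 c \bar y \mid x_1 - \bar x + \lambda_2 c(y_1 - \bar y), \ldots, x_n - \bar x + \lambda_2 c(y_n - \bar y))\} + \mathrm{var}\{E(\lambda_2 \bar z_c \mid z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)\}. \tag{40}$$

Again due to (39),

$$\mathrm{var}(\bar x + \lambda_2 \bar y) = \mathrm{var}(\bar x + \lambda_2 c \bar y + \lambda_2 \bar z_c).$$

Combining this with (40) and recalling that $c\lambda_2 = \lambda_1$ leads to

$$\mathrm{var}(t_{n,\lambda_2}) \ge \mathrm{var}(t_{n,\lambda_1}).$$

The case of $\lambda_1 < \lambda_2 < 0$ is treated similarly. □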
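
The concluding step of the proof can be spelled out as follows. This is a sketch of how the pieces fit together, under the reading that $\bar z_c$ is the mean of $z_{c,1}, \ldots, z_{c,n}$ and $\lambda_1 = c\lambda_2$; the shorthand $R_\lambda$ for the residual vector is introduced here and does not appear in the source.

```latex
% Shorthand (assumed): R_\lambda = (x_1 - \bar x + \lambda(y_1 - \bar y), \ldots,
%                                   x_n - \bar x + \lambda(y_n - \bar y)).
\begin{align*}
\mathrm{var}(t_{n,\lambda_2})
 &= \mathrm{var}(\bar x + \lambda_2 \bar y)
    - \mathrm{var}\{E(\bar x + \lambda_2 \bar y \mid R_{\lambda_2})\} \\
 % variance identity (independence of \bar z_c) plus the sigma-algebra
 % inequality and the decomposition (40):
 &\ge \mathrm{var}(\bar x + \lambda_1 \bar y) + \lambda_2^2\,\mathrm{var}(\bar z_c)
    - \mathrm{var}\{E(\bar x + \lambda_1 \bar y \mid R_{\lambda_1})\} \\
 &\qquad - \mathrm{var}\{E(\lambda_2 \bar z_c \mid z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)\} \\
 &= \mathrm{var}(t_{n,\lambda_1})
    + \bigl[\lambda_2^2\,\mathrm{var}(\bar z_c)
    - \mathrm{var}\{E(\lambda_2 \bar z_c \mid z_{c,1} - \bar z_c, \ldots, z_{c,n} - \bar z_c)\}\bigr] \\
 % a conditional expectation never has larger variance than the variable itself:
 &\ge \mathrm{var}(t_{n,\lambda_1}).
\end{align*}
```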

Theorem 5 has a counterpart in terms of the Fisher information: Let
$X$, $Y$ be independent random variables. If $Y$ is self-decomposable, then
$I(X + \lambda Y)$, as a function of $\lambda$, monotonically increases on
$(-\infty, 0)$ and decreases on $(0, +\infty)$.

The proof is much simpler than that of Theorem 5. Let $0 < \lambda_1 = c\lambda_2$
with $0 < c < 1$. Then
$X + \lambda_2 Y \stackrel{d}{=} X + c\lambda_2 Y + \lambda_2 Z_c = X + \lambda_1 Y + \lambda_2 Z_c$,
where $X$, $Y$ and $Z_c$ are independent, and the claim
$I(X + \lambda_2 Y) \le I(X + \lambda_1 Y)$ follows from the fact that for
independent random variables $\xi$, $\eta$, $I(\xi + \eta) \le I(\xi)$.
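
A Gaussian special case, added here as a sanity check rather than taken from the source, makes both monotonicity statements explicit. It assumes $X \sim N(0, \sigma^2)$ and $Y \sim N(0, 1)$; Gaussian laws are stable and hence self-decomposable.

```latex
% X + \lambda Y ~ N(0, \sigma^2 + \lambda^2); for a Gaussian location family
% the Pitman estimator of \theta is the sample mean.
\[
I(X + \lambda Y) = \frac{1}{\sigma^2 + \lambda^2}, \qquad
\mathrm{var}(t_{n,\lambda}) = \frac{\sigma^2 + \lambda^2}{n}.
\]
% I(X + \lambda Y) increases on (-\infty, 0) and decreases on (0, +\infty),
% while var(t_{n,\lambda}) decreases on (-\infty, 0) and increases on (0, +\infty).
```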
9 Discussion



A few years ago, the Bulletin of the Institute of Mathematical Statistics
published letters [2] and [19] whose authors raised the question of
monotonicity in the sample size of risks of standard ("classical") estimators.
A natural expectation is that, under reasonable conditions, the mean square
error, say, of the maximum likelihood estimator from a sample of size $n + 1$
is less than that from a sample of size $n$.

In this paper a stronger property of the Pitman estimator $t_n$ of a location
parameter is proved. Not only does $\mathrm{var}(t_n)$ monotonically decrease
in $n$, but $\mathrm{var}(t_{n+1}) \le \frac{n}{n+1}\,\mathrm{var}(t_n)$.
However, for another equivariant estimator of a location parameter, one that is
asymptotically equivalent to $t_n$ and has a "more explicit" form than $t_n$,

1 " 

nl ^-^ 

l 

where $J$ is the Fisher score and $I$ the Fisher information, monotonicity in $n$
of its variance is an open question. In a general setup, it is not clear what
property of the maximum likelihood estimator is responsible for monotonicity of
the risk when monotonicity holds.

A recent paper [9] proved monotonicity in the sample size of the
length of some confidence intervals.

It remains a challenge to determine when it is worth making an extra
observation.